| Midterm Exam  *\_\_\_\_\_ \_\_\_\_\_ \_\_\_\_ \_\_\_\_ \_\_\_\_\_\_\_\_\_\_ \_\_\_*  *| \_\_\_\_| \_\_\_\_/ \_\_\_/ \_\_\_| |\_\_\_ /\_\_\_ / \_ \*  *| \_| | \_|| | \\_\_\_ \ |\_ \ / / | | |*  *| |\_\_\_| |\_\_| |\_\_\_ \_\_\_) | \_\_\_) |/ /| |\_| |*  *|\_\_\_\_\_|\_\_\_\_\_\\_\_\_\_|\_\_\_\_/ |\_\_\_\_//\_/ \\_\_\_/*  EECS 370 Spring 2023: Introduction to Computer Organization |
| --- |

| You are to abide by the University of Michigan College of Engineering Honor Code. Please sign below to signify that you have kept the honor code pledge:  ***I have neither given nor received aid on this exam,  nor have I concealed any violations of the Honor Code.*** | | |
| --- | --- | --- |
| Signature: | *\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_* | |
| Name: | *\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_* | |
| Uniqname: | *\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_* | |
| Uniqname of person sitting to your ***Right***  **(**Write 丄 if you are at the end of the row) | | *\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_* |
| Uniqname of person sitting to your ***Left***  **(**Write 丄 if you are at the end of the row) | | *\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_* |

**Exam Directions:**

* You have **120 minutes** to complete the exam. There are **7** questions in the exam on **15** pages (double-sided). **Please flip through your exam to ensure you have all pages.**
* You must show your work to be eligible for partial credit!
* Write legibly and dark enough for the scanners to read your answers.
* **Write your uniqname on the line provided at the top of each page.**

**Exam Materials:**

* You are allotted **one** **8.5 x 11 double-sided** note sheet to bring into the exam room.
* You are allowed to use calculators that do not have an internet connection. All other electronic devices, such as cell phones or anything or calculators with an internet connection, are strictly forbidden.

| *Problem* | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| *Point Value* | 15 | 13 | 11 | 9 | 12 | 15 | 25 |

# Problem 1: Multiple Choice 15 points

Completely shade in the boxes with the correct answers. Select only 1 answer, unless specified by “**(FILL IN ALL THAT APPLY).**” *[1.5 points each]*

1. Which of the following is usually true of RISC ISAs, in comparison to CISC ISAs? (**FILL IN ALL THAT APPLY)**
   * Produce smaller programs
   * Support fewer instructions
   * Encode different instructions to different bit widths
   * Have simpler decoding logic
2. The error "program.o: unresolved external symbol 'int x'" would be reported during which of the below processes?
   * + Compilation
     + Assembly
     + Linking
     + Loading
     + Execution
3. Endianness determines which of the below (assume a byte addressable system)? (**FILL IN ALL THAT APPLY)**
   * + Which varieties of load/store instructions are supported in an ISA
     + Which bit read from a byte of memory is considered most significant
     + Which byte read from a word in memory is considered most significant
     + Which bits of a register are used to hold a single byte loaded from memory
4. LC2K has what kind of addressability?
   * Byte
   * Word
   * Big endian
   * Little endian
5. When assembling the following code LEGv8, what offset would the label func be replaced with?

b.le func

add X0, X3, X7

func lsl X2, X4, #3

* + 1
  + 2
  + 4
  + 8
  + It depends on what line in the program the instructions are placed

1. Which LC2K instructions calculate a memory address by adding a base value to an offset? (**FILL IN ALL THAT APPLY)**
   * add
   * lw
   * sw
   * jalr
   * beq
2. Increasing the number of registers in an ISA, while keeping the instruction size the same, generally: (**FILL IN ALL THAT APPLY)**
   * Decreases the number of loads and stores needed to execute a given program
   * Increases the number of opcodes that can be supported
   * Requires more bits be used to indicate each register
   * Results in lower ALU latency
3. What is the effect of running the following 2-line LC2K program?

.fill 3

halt

* + Nothing happens
  + A value is written into memory
  + A value is written into a register
  + The program infinite loops
  + The program has undefined behavior

1. Moving from a single-cycle processor to a multi-cycle processor is expected to reduce which of the following when running a given program? (**FILL IN ALL THAT APPLY)**
   * Cycle time
   * Number of instructions executed
   * Average cycles to execute each instruction
   * Average time to execute some instructions
2. Which of the following can be implemented without sequential logic? (**FILL IN ALL THAT APPLY)**
   * Mux
   * PC
   * ALU
   * Decoder
   * Register

# Problem 2: Mixed Signals 13 points

1. Convert each of the 8-bit hexadecimal numbers into its binary form and decimal form (both for treating the original number as a signed (two's-complement) and an unsigned value). *[2.5 points]*

| **Hexadecimal** | **Binary** | **Decimal (signed)** | **Decimal (unsigned)** |
| --- | --- | --- | --- |
| 0x2C |  |  |  |
| 0x98 |  |  |  |

1. Complete the timing diagram below for a D latch, a rising-edge triggered D flip-flop, and a delayed OR gate.
   * In the case of the latch, “clock” is the Gate signal.
   * Assume there is no meaningful delay for the latch and flip-flop.
   * In the case of the OR gate, “clock” is the other input and the gate has a delay of **2ns**.
   * If a value is unknown, indicate that clearly using the notation shown. *[3.5 points]*

**![](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR4XmP4//8/AwAI/AL+GwXmLwAAAABJRU5ErkJggg==)**

1. Draw a Moore-style finite state machine with one input (I) and one output (O). O should be set to 1 if and only if I had been set to 1 the previous 3 cycles. Write the value of "I" that triggers each transition next to the corresponding arrow, and the output of "O" where appropriate. Your state machine must have the minimum number of states needed to receive full credit. Don't forget to indicate a reset state. *[3.5 points]*
2. Consider an 8-bit floating point format based on the IEEE standard where the most significant is used for the sign, the next 3 bits are used for the exponent and the last 4 bits are used for the mantissa. The scheme uses “biased 3” to represent the exponent (rather than biased 127 used for a 32-bit IEEE floating point number) and has an implicit one just like the IEEE format. This scheme is called “VSF” (very short float).

| Sign | Exponent | Mantissa |
| --- | --- | --- |
| 7 | 6 5 4 | 3 2 1 0 |

Write the *binary* encoding of -0.75 as a VSF number. *[3.5 points]*  
  
0b\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

# Problem 3: Rough Root 11 points

Consider the following C function (on the left) that takes as input a non-negative integer n and returns the floor of its square root. A programmer translated it into ARM by hand, but unfortunately made some mistakes. Answer the questions on the next page. **Assume the mul instruction is supported and is used correctly on line 13.**

|  | C (correct, tested) |  | LEGv8 (buggy, hand-written) |  |
| --- | --- | --- | --- | --- |
| 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30 | unsigned flrsqrt(unsigned n) {  unsigned r = n;  unsigned l = 0;  unsigned ans = 0;  while ( l <= r ) {    unsigned m = (l + r) / 2;  unsigned m2 = m \* m;    if (m2 == n) {  return m;  } else if (m2 < n) {  l = m + 1;  ans = m;  } else {  r = m - 1;  }  }  return ans;  } | 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30 | flrsqrt:  mov X10, X0  mov X5, X10  mov X4, #0  mov X6, #0  b comp  loop:  cmp X4, X5  b.gt end  body:  add X3, X4, X5  lsr X7, X3, #1  mul X8, X7, X7  cmp X8, X10  b.ne elif  mov X0, X7  b ret  elif:  cmp X8, X10  b.le else  add X4, X7, #1  mov X6, X7  b loop  else:  sub X5, X7, #1  b loop  end:  mov X0, X7  ret:  br lr | 1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30 |

1. Below is a list of the local variables declared in the C function. What registers did the programmer choose to store these values? Which is used as a scratch register? *[7 points]*

| Variable/Scratch | Register |
| --- | --- |
| n |  |
| r |  |
| l |  |
| ans |  |
| m |  |
| m2 |  |
| <scratch> |  |

1. There are two bugs after line 18 - one is an incorrect opcode, the other is an incorrect source register. Identify the two bugs and suggest a one line fix for each. *[4 points]*

Bug 1

Line #: \_\_\_\_\_\_\_\_\_

Replacement instruction: \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

Bug 2

Line #: \_\_\_\_\_\_\_\_\_

Replacement instruction: \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

# Problem 4: Some Assembly Required 9 points

Fill in the blanks of the object file. Recall that the 0x notation indicates that the answer is expected in *hexadecimal.* (That only applies to the blanks that start with 0x, all other answers should be in the format expected in project 2a.)

| PC | main.as | PC | main.obj |
| --- | --- | --- | --- |
| 0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22 | Main lw 0 1 Count  loop lw 0 6 One  sw 5 1 Stack  add 5 6 5  lw 0 6 Func1  jalr 6 7  lw 0 6 Neg1  add 5 6 5  lw 5 1 Stack  lw 0 6 One  sw 5 1 Stack  add 5 6 5  lw 0 6 func2  jalr 6 7  lw 0 6 Neg1  add 5 6 5  lw 5 1 Stack  add 1 6 1  beq 1 0 end  beq 0 0 loop  end halt  One .fill 1  func2 .fill Delta | 0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22 | \_\_\_\_\_\_\_ 2 7 12  8454144  8781845  15269888  3014661  0x\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_  24576000  8781824  3014661  11075584  8781845  15269888  3014661  8781846  24576000  8781824  3014661  11075584  917505  0x\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_  16842733  25165824  1  0  \_\_\_\_\_\_\_\_\_\_\_ T \_\_\_\_  \_\_\_\_\_\_\_\_\_\_\_ D \_\_\_\_  Stack \_\_\_\_\_\_\_ 0  Count U 0  \_\_\_\_\_\_\_\_\_\_\_ U 0  Neg1 U 0  \_\_\_\_\_\_\_\_\_\_\_ U 0  0 lw Count  1 lw One  1 \_\_\_\_\_\_\_\_\_ \_\_\_\_\_\_\_\_\_\_\_  2 sw Stack  4 lw Func1  6 lw Neg1  \_\_\_\_ \_\_\_\_\_\_\_\_ \_\_\_\_\_\_\_\_\_\_\_  9 lw One  \_\_\_\_ \_\_\_\_\_\_\_\_ \_\_\_\_\_\_\_\_\_\_\_  \_\_\_\_ \_\_\_\_\_\_\_\_ \_\_\_\_\_\_\_\_\_\_\_  14 lw Neg1  16 lw Stack |

# Problem 5: Call-ee-2-Save 12 points

Consider the following two LC2K files and answer the questions below.

| main.as (repeated from last problem) | funcs.as |
| --- | --- |
| Main lw 0 1 Count  loop lw 0 6 One  sw 5 1 Stack  add 5 6 5  lw 0 6 Func1  jalr 6 7  lw 0 6 Neg1  add 5 6 5  lw 5 1 Stack  lw 0 6 One  sw 5 1 Stack  add 5 6 5  lw 0 6 func2  jalr 6 7  lw 0 6 Neg1  add 5 6 5  lw 5 1 Stack  add 1 6 1  beq 1 0 end  beq 0 0 loop  end halt  One .fill 1  func2 .fill Delta | alpha lw 0 6 One  sw 5 2 Stack  add 5 6 5  sw 5 4 Stack  add 5 6 5  nor 1 1 2  add 2 1 4  nor 2 4 3  lw 0 6 Neg1  add 5 6 5  lw 5 4 Stack  add 5 6 5  lw 5 2 Stack  jalr 7 6  Delta lw 0 6 One  sw 5 2 Stack  add 5 6 5  lw 0 3 Neg1  add 5 3 5  lw 5 2 Stack  jalr 7 6  Count .fill 3  Func1 .fill alpha  Neg1 .fill -1 |

Determine whether each of the following registers is caller / callee save and the number of load/store pairs that occur when the program is executed.

| **Register** | **Caller or Callee** |
| --- | --- |
| 1 |  |
| 2 |  |
| 4 |  |
| 6 |  |

# Problem 6: Addi it up 15 points

In this problem, you will implement some C code using a new ISA. The specifications for the ISA are below.

* Like LC2K, there are 8 registers, in the range [0-7]
* Registers and Instructions are 16 bits
* Immediates are 6-bit 2’s complement numbers
* **Memory is byte addressable**
* Memory accesses must be aligned

| Instruction Type | Parameters | Example |
| --- | --- | --- |
| R (Register) | regA regB destReg | nand 1 2 3 |
| I (Immediate) | regA regB immediate | lw 0 1 label |
| O | <none> | halt |

Here is a subset of instructions relevant for the following problem:

| Name | type | description |
| --- | --- | --- |
| nand | R | destReg = ~(regA & regB) |
| lsl | R | destReg = regA << regB |
| addi | I | regB = regA + immediate |
| lh | I | regB = mem[regA + immediate] (2 bytes transferred) |
| beq | I | Branch to PC + 2 + immediate if regA == regB |
| ret | O | Returns to calling function - return address read from special purpose register |

The following code snippet includes a function score\_attendance, which is used to determine the number of labs attended by a student. There are 16 labs in the term, with each lab attendance grade corresponding to a bit in attendance.

1. Fill in the blanks to complete the implementation for score\_attendance as a standalone function that stores the return value sum in register 7. At the beginning of the function execution, you may assume that register 0 contains 0, and register 1 contains id. All registers are caller-saved. Only use instructions from the above subset. *[12 points]*

| #include <stdint.h>  #define CLASS\_SIZE 64  struct student {  uint8\_t credits; // unused  uint16\_t attendance;  };  struct student class[CLASS\_SIZE];  /\* returns sum of attendances for one student across all 16 labs \*/  uint16\_t score\_attendance (  uint16\_t id) {  uint16\_t sum = 0;  uint8\_t labNum = 16;  while(labNum != 0) {  labNum--;  if (class[id].attendance  & (1 << labNum)) {  sum++;  }  }  return sum;  } | # init sum and labNum  addi 0 7 #0  addi 0 2 #16  loop beq \_\_\_ \_\_\_ end  addi 2 2 #\_\_\_  # load attendance  addi \_\_\_ 3 #\_\_\_  lsl 1 3 3  addi 3 4 #\_\_\_  lh 4 5 class  # apply mask  addi 0 6 #1  lsl 6 \_\_\_ 6  nand \_\_\_ \_\_\_ 6  \_\_\_ \_\_\_ \_\_\_ 6  beq 6 0 \_\_\_  addi \_\_\_ \_\_\_ #1  # next iteration  endif beq 0 \_\_\_ loop  end ret  class .fill ... |
| --- | --- |

1. When assembling the following code using this new ISA, what numeric immediate should the assembler replace the label "target" with? *[2 points]*

| beq 0 0 target  add 0 0 0  target ret | immediate =   |  | | --- | |
| --- | --- | --- |

# Problem 7: Just a Bit Clearer 25 points

Many versions of the ARM ISA provide a BIC, or Bit Clear, instruction that operates similar to the “reset” signal in an SR latch, clearing selected bits to 0 and leaving others unmodified. After seeing its great utility, we want to implement this instruction in LC2K.

In LC2K, bic will be an I-type instruction. For each bit that is a 1 in the contents of register A or in the memory value pointed to by offset, the corresponding bit in the contents of register B will be reset to 0.

We can represent this operation with the following C code:

regB = regB & ~(regA | Mem[offset]);

By DeMorgan’s laws, this can also be represented as:

regB = regB & ~regA & ~Mem[offset];

Or:

regB = ~(~regB | regA | Mem[offset]);

For example, the following bic with the corresponding .fill would clear bits 7-4 of register 1:

bic 0 1 bmask

…

bmask .fill 240 //0xF0

**a)** To help you understand the data operations in this instruction, express bic with *only* 2-input bitwise NOR gates. Assume A is the contents of register A, B is the contents of register B, and M is the contents in memory pointed to by offset. *[2.5 points]*

![](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR4XmP4//8/AwAI/AL+GwXmLwAAAABJRU5ErkJggg==)

**b)** To perform part of the bic operation, the ALU has been extended to support bitwise **AND**, as well as the original addition and bitwise NOR. Modify the single-cycle datapath to support bic and **ALL** original LC2K instructions except jalr and halt with minimal hardware. You may add MUXes and connections inside the dotted box. You do not need to show any control signals. *[7 points]*

![](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR4XmP4//8/AwAI/AL+GwXmLwAAAABJRU5ErkJggg==)

**c)** After analyzing the original single-cycle datapath and your modified datapath, we have found that the clock period for the original design was 80ns and the clock period for the modified design is 144ns. We have a large program where X% of the instructions are bic instructions. To compare the designs, we have another version of the program that replaces all bic instructions with **5** other instructions, which we run on the original design.

Circle one & fill in the blank: More / less than \_\_\_\_\_\_\_% of the instructions executed in the program must be bic instructions for the program to run faster on the modified design. Show your work above. *[4 points]*

**d)** It turns out that very few programs satisfy the constraint you found in part (c). To get better performance, a multi-cycle design might be better. The diagram (**provided as a separate reference sheet**) shows the multi-cycle datapath, modified with new connections to implement the bic instruction. As in part (b), the ALU supports bitwise **AND** with control signals 0b10.

Provide a sequence of operations that implement the bic instruction in 6 cycles. You might not use all blanks. Then fill out the empty boxes in the control ROM for the bic instruction. *[7.5 points]*

Cycle 1 (Instruction Fetch): Instr Reg = Mem[PC], ALU\_result = PC + 1

Cycle 2 (Instruction Decode): PC = ALU\_result, Read Register Values

Cycle 3: \_\_\_\_\_\_\_\_\_\_\_\_ = \_\_\_\_\_\_\_\_\_\_\_\_, \_\_\_\_\_\_\_\_\_\_\_ = \_\_\_\_\_\_\_\_\_\_\_\_

Cycle 4: \_\_\_\_\_\_\_\_\_\_\_\_ = \_\_\_\_\_\_\_\_\_\_\_\_, \_\_\_\_\_\_\_\_\_\_\_ = \_\_\_\_\_\_\_\_\_\_\_\_

Cycle 5: \_\_\_\_\_\_\_\_\_\_\_\_ = \_\_\_\_\_\_\_\_\_\_\_\_, \_\_\_\_\_\_\_\_\_\_\_ = \_\_\_\_\_\_\_\_\_\_\_\_

Cycle 6 (Writeback): Register B = ALU\_result

| bic Cycle | PC en | MUX addr | MEM en | MEMr/w | IR en | MUX dest | MUX data | RFile wr en | MUX alu 1 | MUX alu 2 | ALU op |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 (IF) | 0 | 00 | 1 | 0 | 1 | X | X | 0 | 00 | 001 | 00 |
| 2 (ID) | 1 | XX | 0 | X | 0 | X | X | 0 | XX | XXX | XX |
| 3 | 0 |  |  |  | 0 | X | X | 0 |  |  |  |
| 4 | 0 |  |  |  | 0 | X | X | 0 |  |  |  |
| 5 | 0 |  |  |  | 0 | X | X | 0 |  |  |  |
| 6 (WB) | 0 | XX | 0 | X | 0 |  |  | 1 | XX | XXX | XX |

**e)** Assume that our large program has the following distribution of instructions executed:

* 30% add instructions
* 5% nor instructions
* 10% lw instructions
* 25% sw instructions
* 15% beq instructions
* 15% **bic** instructions
* The program is large enough so that the halt instruction is negligible.

Find the CPI of this program on the new multi-cycle design, assuming that for the original LC2K instructions, the new design behaves exactly like the one described in class. Write down an exact decimal number for your answer. Show your work. *[4 points]*

CPI = \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

![](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR4XmP4//8/AwAI/AL+GwXmLwAAAABJRU5ErkJggg==)